Skip to content

Add the affected program address to each InstructionError #74

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

steveluscher
Copy link
Contributor

Problem

Consider a transaction error that originates from a cross-program invocation (ie. an inner instruction). Currently TransactionError returns you the index of the outer instruction, which in no way helps you to correlate the content of the error message with the actual program from whence it came.

Summary of changes

Added program_address to the InstructionError type, which will help consumers correlate instruction errors with their actual source.

Addresses anza-xyz/agave#5152.

@steveluscher steveluscher requested a review from a team as a code owner March 6, 2025 21:44
@steveluscher
Copy link
Contributor Author

steveluscher commented Mar 6, 2025

This is a bit of a spitball pr. Things I'm thinking about:

  1. This will require a major version bump, because Rust. It will probably require a major version bump in a lot of downstream packages too. Perhaps all software ever written.
  2. I imagine that this will increase the storage cost of transactions by the 32 bytes of the pubkey. Maybe this is fine, but perhaps there is an alternative, like storing the index of the program in the accounts array and then reconstructing the program address later (ie. in the RPC method that loads and vends the InstructionError)?
  3. We'll have to teach the program runtime to include the program address when it throws InstructionErrors.

@steveluscher steveluscher force-pushed the add_program_address_to_instruction_error branch 3 times, most recently from fcc12a6 to 270d3c8 Compare March 6, 2025 22:40
@steveluscher steveluscher force-pushed the add_program_address_to_instruction_error branch from 270d3c8 to d7fbd88 Compare March 6, 2025 22:42
/// third element indicates the address of the program that raised the error, if applicable; the
/// error could after all have been raised during a cross-program invocation (ie. in an inner
/// instruction).
InstructionError(u8, InstructionError, Option<Pubkey>),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We won't have a program address in cases where the error is InstructionError::UnsupportedProgramId, so this is made an Option.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate what cases UnsupportedProgramId is for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, for sure. It's thrown, for instance, here:

https://github.com/anza-xyz/agave/blob/1bcf743f445270407bccf55681f74e06ef8b9a48/program-runtime/src/invoke_context.rs#L250-L252

ie. if you tried to issue an instruction without providing the address of a program.

The larger point here is that there exist InstructionErrors that are more like ‘there was a structural problem with this instruction’ rather than ‘a program died and here's why.’ The latter has a program address associated with it while the former may not.

@kevinheavey
Copy link
Contributor

btw this is a breaking change, but we're already due a bump to 3.0 because of #46

/// third element indicates the address of the program that raised the error, if applicable; the
/// error could after all have been raised during a cross-program invocation (ie. in an inner
/// instruction).
InstructionError(u8, InstructionError, Option<Pubkey>),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding Option<Pubkey> here means TransactionError is now 64 bytes instead of 32 bytes. I'm not sure how often we move/clone these on the agave side, but this could have some small perf implications. I'm aware of at least a few places we clone these values.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^ this is not a comment on why we shouldn't do this - just a note that we should be aware of this size change, and be more aware of where we're cloning these in our processing pipeline so we can avoid performance hits

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels bad. I should probably endeavour to store just the index of the program account in the accounts array, and then make the downstream thing that really wants to know what the program was look it up.

I'll rifle through the Agave code and see if that's tractable (ie. the instruction will need to be available wherever something needs know what program address was involved).

@steveluscher
Copy link
Contributor Author

So, here's what I'm thinking. There are three ways to approach this.

Option 1 – Add program address to TransactionError::InstructionError

- TransactionError::InstructionError(u8, InstructionError)
+ TransactionError::InstructionError(u8, InstructionError, Option<Pubkey>)
  • the address (or address index) will have to be Option, because not all InstructionErrors are related to a program
  • breaks people who construct TransactionError::InstructionError (ie. the SVM)
  • does not break people who construct InstructionError

Option 2 – Add program address to every InstructionError that is related to a program

  ArithmeticOverflow,  // Leave as is; not related to a program
- InstructionError::CallDepth,
+ InstructionError::CallDepth(Pubkey)
  • does not break people who construct TransactionError::InstructionError (ie. the SVM)
  • breaks people who construct InstructionError

Note

There still might exist some InstructionErrors that sometimes have programs related to them and other times do not. These would still, unfortunately for complexity, be Option<Pubkey>

Option 3 – Create a TransactionError::InstructionErrorCausedByProgram type

  InstructionError(u8, InstructionError),
+
+ InstructionErrorCausedByProgram(u8, InstructionError, Pubkey)
  • no need for making the address (or address index) Option; just return the relevant TransactionError::InstructionError
  • would need to make a separate enum to represent the union of both, or use Either (ie. functions would have to have return types like Result<(), Either<InstructionError, InstructionErrorCausedByProgram>)

@joncinque
Copy link
Collaborator

So good news, I went through all of the InstructionError variants, and they pretty much all pertain to a certain program, or preparing the execution of a certain program. [1]

To go with that, the TransactionError::InstructionError variant uses a u8 for the index of the top-level instruction in the transaction, which means that there's already some notion of "this error requires some greater context to truly understand".

I'm down to go with the option 1, but using a u8 for the index of the account in the transaction, as mentioned earlier. Looking at the storage protos for transaction errors, I think we should just be able to add a new field for the originating account: https://github.com/anza-xyz/agave/blob/bc09ffa335d9773fd6c4b354e61c44b8fc36724a/storage-proto/proto/transaction_by_addr.proto#L21

It'll be a bit of a slog to get through all the changes, but it should be mostly straightforward. Happy to hear other ideas though!

[1] if I'm incorrect, then we should fix the usage of that particular error variant rather than torpedo the design

@steveluscher
Copy link
Contributor Author

steveluscher commented Mar 18, 2025

…pretty much all pertain to a certain program, or preparing the execution of a certain program

I started then scrapped a PR that did option 1 and I seem to remember it being really hard to obtain the program address in places. Here's cargo check after adding a u8 to TransactionError::InstructionError (gist).

I can try again, but a few on spec:

  • UnsupportedProgramId doesn't, by its very nature. (link)
  • Do MissingAccount and NotEnoughAccountKeys ever happen before we know what the program address is? (link, link)

The big problem is here, because it takes in a dynamic error after long having forgotten what program is responsible for the error: (link)

So basically you have to follow process_precompile and process_instruction all the way down, and make sure that they throw the responsible program address up through the Result, which means now you're not throwing an InstructionError there, you're throwing either an InstructionErrorWithResponsibleProgram or a (InstructionError, u8). I found this to get really hairy. (link)

@joncinque
Copy link
Collaborator

UnsupportedProgramId doesn't, by its very nature. (link)

That one looks at the owner of the first program account in the transaction context, so we could probably use that, right? https://github.com/anza-xyz/agave/blob/ccbf3f25f332cf4bfa4f1a9bf27db8d3333b3064/program-runtime/src/invoke_context.rs#L523

Do MissingAccount and NotEnoughAccountKeys ever happen before we know what the program address is? (link, link)

If they do, then that's an issue with their usage, which should be fixable.

The big problem is here, because it takes in a dynamic error after long having forgotten what program is responsible for the error: (link)

That plumbing does look a bit involved, but should end up similar to any big refactor.

@joncinque joncinque added the breaking PR contains breaking changes label Mar 31, 2025
@buffalojoec
Copy link
Contributor

buffalojoec commented Apr 7, 2025

@joncinque @steveluscher Hey guys, sorry to come in here late!

Another major issue I see with obtaining the program ID for an inner instruction from the transaction accounts is that CPI callees will eventually no longer be in that list at all.
https://github.com/solana-foundation/solana-improvement-documents/blob/main/proposals/0163-lift-cpi-caller-restriction.md

If a program decides to hard-code TokenkegQfeZyiNwAJbNbGKPFXCWuBvf9Ss623VQ5DA and CPI to it, the RPC isn't going to know about it without parsing either the transaction logs or the inner instructions payload.

Rather than plumbing callee IDs all the way from SVM, what about just enabling inner instructions recording on Bank all of the time, and walking that payload to grab the program ID? The array of inner instructions comes back empty when there's an error, but we can just change that behavior.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking PR contains breaking changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants